[docs] Add Google-style docstrings for dspy/evaluate/metrics.py #8954
Thanks for the PR!
dspy/evaluate/metrics.py
Outdated
      def EM(prediction, answers_list):  # noqa: N802
    -     assert isinstance(answers_list, list)
    +     """Return True if any reference exactly matches the prediction (after normalization).
The opening line should describe what this API is, instead of the API's behavior.
dspy/evaluate/metrics.py
Outdated
        otherwise False.
        Example:
            >>> EM("The Eiffel Tower", ["Eiffel Tower", "Louvre"])
This doesn't render on mkdocs. Let's use the fenced code block style instead, e.g.:
    my_code
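As a rough illustration of both review points so far (a summary line that says what the API is, and an example in a fenced block instead of `>>>` prompts), a docstring could be shaped like the sketch below. `exact_match` and its body are hypothetical stand-ins, not the PR's code, and the inner fence uses `~~~` only so it can nest inside this code block; a backtick fence would take its place in the real file.

```python
def exact_match(prediction: str, references: list[str]) -> bool:
    """Exact-match metric over a list of reference answers.

    Args:
        prediction: The predicted answer string.
        references: Candidate reference answers.

    Returns:
        True if the prediction equals any reference, ignoring case and
        surrounding whitespace; otherwise False.

    Example:
        ~~~python
        exact_match("paris", ["Paris"])  # True
        ~~~
    """
    # Hypothetical, simplified comparison used only to keep the sketch runnable.
    pred = prediction.strip().lower()
    return any(pred == ref.strip().lower() for ref in references)
```

The summary line names the thing ("Exact-match metric ...") and leaves the behavior to the `Returns` section.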
dspy/evaluate/metrics.py
Outdated
            >>> EM("paris", ["Paris"])
            True
        """
        assert isinstance(answers_list, list)
Let's not mix fixes with docstring changes. Also, this assert statement doesn't give users any additional information.
dspy/evaluate/metrics.py
Outdated
            >>> round(F1("Eiffel Tower is in Paris", ["Paris"]), 2)
            0.33
        """
        assert isinstance(answers_list, list)
ditto
dspy/evaluate/metrics.py
Outdated
            float: The highest HotpotQA-style F1 score in [0.0, 1.0].
        Example:
            >>> HotPotF1("yes", ["no"])
Use a code block here too.
dspy/evaluate/metrics.py
Outdated
      def remove_articles(text):
    -     return re.sub(r"\b(a|an|the)\b", " ", text)
    +     return re.sub(r"\\b(a|an|the)\\b", " ", text)
do we need to change this?
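A quick check of why this change matters: inside a raw string, `\b` is a word-boundary assertion, while `\\b` matches a literal backslash followed by the letter `b`. The snippet below is a standalone sketch with a made-up sample sentence, not code from the PR.

```python
import re

text = "the cat sat on a mat"

# Single backslash in a raw string: \b is a regex word boundary,
# so standalone articles are replaced with a space.
original = re.sub(r"\b(a|an|the)\b", " ", text)

# Doubled backslash in a raw string: the pattern now requires a literal
# backslash character before "b", which ordinary text does not contain.
changed = re.sub(r"\\b(a|an|the)\\b", " ", text)

print(original.split())  # articles are gone
print(changed == text)   # the doubled-backslash pattern matched nothing
```

With the doubled backslashes the substitution silently becomes a no-op on normal text, so articles would no longer be stripped during normalization.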
dspy/evaluate/metrics.py
Outdated
    def answer_exact_match(example, pred, trace=None, frac=1.0):
        """Example/Prediction evaluator for answer strings with EM/F1 thresholding.
        If ``example.answer`` is a string, compare ``pred.answer`` against it.
nit: use single backticks around variables: `example.answer`
dspy/evaluate/metrics.py
Outdated
    def answer_exact_match(example, pred, trace=None, frac=1.0):
        """Example/Prediction evaluator for answer strings with EM/F1 thresholding.
This is too detailed for the opening sentence.
…k examples); revert non-doc changes
All feedback addressed (concise openings + mkdocs block examples).
This is pretty good, thank you for the contribution, LGTM!
Thanks!
This is so great, I love these.
Thanks!
commit 31b96af
Author: Dushmanta <[email protected]>
Date: Thu Oct 30 13:52:40 2025 +0530

    fix: broken PyPI downloads badge from pepy.tech in README and docs home page (stanfordnlp#8995)
    * fix: update broken pypi download badge in readme
    * fix: update broken pypi download badge in docs home page

commit 056d54e
Author: Isaac Miller <[email protected]>
Date: Wed Oct 29 17:23:09 2025 +0100

    fix(MIPROv2): zero shot not taking .compile parameters into account before determining if the program was zero shot (stanfordnlp#8909)
    * fix(MIPROv2): zero shot not taking .compile parameters into account before determining if the program was zero shot
    * remove extra logs
    * Remove log
    * Fix merge conflict
    * Remove extra whitespace

commit da69f9d
Author: TomuHirata <[email protected]>
Date: Wed Oct 29 13:23:34 2025 +0900

    Update anthropic model name (stanfordnlp#8992)
    Signed-off-by: TomuHirata <[email protected]>

commit aaadf05
Author: Chen Qian <[email protected]>
Date: Tue Oct 28 12:21:55 2025 -0700

    lints (stanfordnlp#8987)

commit e842ba1
Author: eramis73 <[email protected]>
Date: Tue Oct 28 02:40:34 2025 +0300

    [docs] Add Google-style docstrings for dspy/evaluate/metrics.py (stanfordnlp#8954)
    * docs(metrics): add Google-style docstrings for public metrics
    * docs(metrics): address review feedback (concise openings, mkdocs block examples); revert non-doc changes
    * fixes
    ---------
    Co-authored-by: chenmoneygithub <[email protected]>

commit 6c43880
Author: TomuHirata <[email protected]>
Date: Tue Oct 28 07:21:06 2025 +0900

    Cache Ollama to speed up CI (stanfordnlp#8972)
    * Cache Ollama to speed up CI
    * fix permission

commit 462baef
Author: Copilot <[email protected]>
Date: Mon Oct 27 11:57:27 2025 -0700

    Fix TypeError when tracking usage with Anthropic models returning Pydantic objects (stanfordnlp#8978)
    * Initial plan
    * Fix TypeError when merging Anthropic CacheCreation objects in usage tracker
      Co-authored-by: TomeHirata <[email protected]>
    * Enhance _flatten_usage_entry to convert Pydantic models on first add
      Co-authored-by: TomeHirata <[email protected]>
    * Fix potential TypeError when both usage entries are None
      Co-authored-by: TomeHirata <[email protected]>
    * simplify
    * small fix
    * lint
    * robust version handling
    ---------
    Co-authored-by: copilot-swe-agent[bot] <[email protected]>
    Co-authored-by: TomeHirata <[email protected]>
    Co-authored-by: chenmoneygithub <[email protected]>

commit 9b467b5
Author: Noah Ziems <[email protected]>
Date: Mon Oct 27 13:32:07 2025 -0400

    Add Disable Fallback Option in ChatAdapter (stanfordnlp#8984)

commit bf022c7
Author: Lakshya A Agrawal <[email protected]>
Date: Sat Oct 25 23:37:42 2025 +0530

    Update gepa[dspy] dependency version to 0.0.18 (stanfordnlp#8969)
    * Update gepa[dspy] dependency version to 0.0.18
    * Update pyproject.toml
    * fix test
    ---------
    Co-authored-by: TomuHirata <[email protected]>
This PR adds Google-style docstrings to the public metrics in dspy/evaluate/metrics.py.
Resolves #8953
cc @chenmoneygithub
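For readers unfamiliar with the metrics being documented, here is a minimal, self-contained sketch of normalized exact match and token-level F1 that is consistent with the examples quoted in the review. It is not the code in `dspy/evaluate/metrics.py`: the names `normalize`, `exact_match`, `token_f1`, `best_f1`, and `answer_matches` are hypothetical, the normalization steps are an assumption, and the `frac` handling is one plausible reading of "EM/F1 thresholding".

```python
import re
import string
from collections import Counter


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, references: list[str]) -> bool:
    """True if the normalized prediction equals any normalized reference."""
    pred = normalize(prediction)
    return any(pred == normalize(ref) for ref in references)


def token_f1(prediction: str, reference: str) -> float:
    """Harmonic mean of token precision and recall after normalization."""
    pred_tokens = normalize(prediction).split()
    ref_tokens = normalize(reference).split()
    # Multiset intersection counts each shared token at most as often
    # as it appears on both sides.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


def best_f1(prediction: str, references: list[str]) -> float:
    """Highest token F1 against any reference (0.0 if there are none)."""
    return max((token_f1(prediction, ref) for ref in references), default=0.0)


def answer_matches(prediction: str, references: list[str], frac: float = 1.0) -> bool:
    """Assumed thresholding: exact match at frac >= 1.0, else F1 >= frac."""
    if frac >= 1.0:
        return exact_match(prediction, references)
    return best_f1(prediction, references) >= frac


print(exact_match("paris", ["Paris"]))
print(round(best_f1("Eiffel Tower is in Paris", ["Paris"]), 2))
```

In the second call the prediction normalizes to five tokens and shares one with the single-token reference, so precision is 1/5 and recall is 1, which gives an F1 of about 0.33.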